Effective ETL testing detects problems with the source data early, before it is loaded into the data repository, and also surfaces inconsistencies or ambiguities in the business rules intended to guide data transformation and integration. The process can be broken down into eight stages.
1: Identify business requirements - Design the data model, define the business flow, and assess reporting needs based on client expectations. It is important to start here so that the scope of the project is clearly defined, documented, and fully understood by testers.
2: Validate data sources - Perform a data count check and verify that table and column data types meet the specifications of the data model. Make sure check keys are in place and remove duplicate data. If this is not done correctly, the aggregate report could be inaccurate or misleading.
3: Design test cases - Design ETL mapping scenarios, create SQL scripts, and define transformation rules. It is important to validate the mapping document as well, to ensure it contains all of the required information.
4: Extract data from source systems - Execute ETL tests per business requirement. Identify the types of bugs or defects encountered during testing and report them. It is important to detect and reproduce any defects, report them, fix and resolve them, and close the bug report before continuing to Step 5.
5: Apply transformation logic - Ensure data is transformed to match the schema of the target data warehouse. Check data thresholds and alignment, and validate the data flow. This ensures the data type matches the mapping document for each column and table.
6: Load data into target warehouse - Perform a record count check before and after data is moved from staging to the data warehouse. Confirm that invalid data is rejected and that the default values are accepted.
7: Summary report - Verify the layout, options, filters, and export functionality of the summary report. This report lets decision-makers and stakeholders know the details and results of the testing process, and whether any step was not completed (i.e., “out of scope”) and why.
8: Test closure - File test closure.
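The data type check on the data model and the record count check around the load step can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module. The table names (staging_orders, dw_orders), the EXPECTED_TYPES spec, and the rule that rejects negative amounts are all hypothetical stand-ins for a real data model and real business rules.

```python
import sqlite3

# Hypothetical data-model spec: column name -> declared type.
EXPECTED_TYPES = {"order_id": "INTEGER", "amount": "REAL", "status": "TEXT"}


def column_type_mismatches(conn, table, expected):
    """Return {column: (expected, actual)} for columns that deviate from the spec."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so it must be a trusted identifier.
    actual = {row[1]: row[2].upper() for row in conn.execute(f"PRAGMA table_info({table})")}
    return {
        col: (want, actual.get(col))
        for col, want in expected.items()
        if actual.get(col) != want
    }


def row_count(conn, table):
    """Count the rows of a (trusted) table name."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def load_valid_rows(conn):
    """Toy load step: only rows with a non-negative amount reach the warehouse."""
    conn.execute(
        "INSERT INTO dw_orders (order_id, amount, status) "
        "SELECT order_id, amount, status FROM staging_orders WHERE amount >= 0"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_orders (order_id INTEGER, amount REAL, status TEXT);
    CREATE TABLE dw_orders (order_id INTEGER PRIMARY KEY, amount REAL, status TEXT);
    INSERT INTO staging_orders VALUES (1, 10.0, 'NEW'), (2, -5.0, 'NEW'), (3, 7.5, 'PAID');
""")

# Check the target table's column types against the spec.
mismatches = column_type_mismatches(conn, "dw_orders", EXPECTED_TYPES)

# Record counts before and after the load.
staged = row_count(conn, "staging_orders")
before = row_count(conn, "dw_orders")
load_valid_rows(conn)
loaded = row_count(conn, "dw_orders") - before
rejected = staged - loaded

print("type mismatches:", mismatches)
print("staged:", staged, "loaded:", loaded, "rejected:", rejected)
```

The point of the reconciliation is that every staged row is accounted for as either loaded or rejected. A real test would also compare the rejected count against the number of rows known to violate the business rules.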
1: OLTP application testing focuses on software code, while OLAP application testing is directed at validating the correctness of data.
2: The volume of data involved in OLAP application testing is typically very large compared with the volume of data involved in testing OLTP applications.
3: Data integration projects present a different set of challenges for testing full and incremental loads.
4: Performance testing of data integration projects presents a different set of challenges compared with OLTP applications, including the need for large volumes of test data.
5: The number of use cases for OLTP applications is finite, while the test scenarios for regression and performance testing of OLAP applications can be virtually unlimited.
> Are the values within the acceptable subset of an encoded list?
> Does the data conform to the business rules?
> Are there any unexpected duplicates?
> Are there any orphan records or missing foreign keys?
> Do the counts match between the source and the target?
> Is the target data consistent with the source data?
> Is the fact-to-dimension table foreign key mapped appropriately in the ETL?
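Several of the questions above translate directly into SQL checks. The sketch below, again using Python's sqlite3 module, covers three of them: values outside an encoded list, unexpected duplicates, and orphan records. The tables (fact_sales, dim_customer), their contents, and the ALLOWED_STATUS list are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER, customer_id INTEGER, status TEXT);
    INSERT INTO dim_customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO fact_sales VALUES
        (100, 1, 'PAID'),
        (101, 2, 'OPEN'),
        (101, 2, 'OPEN'),      -- duplicate sale_id
        (102, 9, 'PAID'),      -- customer 9 has no dimension row
        (103, 1, 'UNKNOWN');   -- status outside the encoded list
""")

ALLOWED_STATUS = ("PAID", "OPEN")  # hypothetical encoded list

# Values outside the acceptable subset of the encoded list.
placeholders = ", ".join("?" for _ in ALLOWED_STATUS)
bad_status = conn.execute(
    f"SELECT sale_id FROM fact_sales WHERE status NOT IN ({placeholders})",
    ALLOWED_STATUS,
).fetchall()

# Unexpected duplicates on the business key.
duplicates = conn.execute(
    "SELECT sale_id, COUNT(*) FROM fact_sales GROUP BY sale_id HAVING COUNT(*) > 1"
).fetchall()

# Orphan fact rows: foreign keys with no matching dimension row.
orphans = conn.execute(
    "SELECT f.sale_id FROM fact_sales f "
    "LEFT JOIN dim_customer d ON d.customer_id = f.customer_id "
    "WHERE d.customer_id IS NULL"
).fetchall()

print("bad status:", bad_status)
print("duplicates:", duplicates)
print("orphans:", orphans)
```

Each query returns the offending rows rather than a pass/fail flag, so an empty result means the check passed and a non-empty one points to the records to investigate. Note that a NULL status would not be caught by the NOT IN check and would need its own test.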
> Test Lookup transformation
> Test Aggregate transformation
> Test Expression transformation
> Regression testing of transformations
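As a rough illustration of the lookup and aggregate transformation tests listed above, the following Python sketch recomputes the expected result independently of the ETL job and compares it with the job's output. The rows, the region_lookup table, and the function names are hypothetical, and the target_rows list stands in for whatever the real job writes.

```python
from collections import defaultdict

# Hypothetical source rows and lookup table.
source_rows = [
    {"region_code": "N", "amount": 10.0},
    {"region_code": "N", "amount": 5.0},
    {"region_code": "S", "amount": 7.0},
]
region_lookup = {"N": "North", "S": "South"}

# Hypothetical output of the ETL job under test.
target_rows = [
    {"region": "North", "total_amount": 15.0},
    {"region": "South", "total_amount": 7.0},
]


def unresolved_lookup_keys(rows, lookup, key):
    """Lookup test: source keys that the lookup table cannot resolve."""
    return sorted({row[key] for row in rows if row[key] not in lookup})


def expected_totals(rows, lookup):
    """Aggregate test oracle: recompute totals independently of the ETL job."""
    totals = defaultdict(float)
    for row in rows:
        totals[lookup[row["region_code"]]] += row["amount"]
    return dict(totals)


missing = unresolved_lookup_keys(source_rows, region_lookup, "region_code")
actual = {row["region"]: row["total_amount"] for row in target_rows}
aggregate_ok = actual == expected_totals(source_rows, region_lookup)

print("unresolved lookup keys:", missing)
print("aggregate matches:", aggregate_ok)
```

Exact equality works here because the sample amounts sum exactly. With real monetary data a tolerance-based comparison (for example math.isclose) or decimal arithmetic is usually safer. Keeping such checks in a repeatable script is also one way to support regression testing of transformations.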
Two main types of test apply to the data warehouse conceptual schema within the scope of functional testing. The first, which we call the fact test, verifies that the workload preliminarily expressed by users during requirements analysis is actually supported by the conceptual schema. We call the second type the conformity test, because it is aimed at assessing how well conformed hierarchies have been designed.
Testing the logical schema before it is implemented and before ETL design can dramatically reduce the impact of errors due to bad logical design. An effective approach to functional testing consists in verifying that a sample of queries in the preliminary workload can correctly be formulated in SQL on the logical schema. We call this the star test.
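One way to approximate a star test is to create the logical schema as empty tables and check that each workload query compiles against it. The sketch below does this with Python's sqlite3 module. The schema, the two workload queries, and the star_test helper are assumptions made for illustration. The second query deliberately refers to a dim_store table that the schema lacks.

```python
import sqlite3

# Hypothetical logical schema: one fact table and two dimension tables.
SCHEMA = """
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    revenue REAL
);
"""

# Hypothetical sample of the preliminary workload.
WORKLOAD = {
    "revenue by category and year": """
        SELECT p.category, d.year, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
    """,
    "revenue by store": """
        SELECT s.store_name, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_store s ON s.store_id = f.store_id
        GROUP BY s.store_name
    """,
}


def star_test(schema, workload):
    """Return {query name: error message} for queries the schema cannot support."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema)
    unsupported = {}
    for name, sql in workload.items():
        try:
            # EXPLAIN compiles the statement without needing any data.
            conn.execute("EXPLAIN " + sql)
        except sqlite3.OperationalError as exc:
            unsupported[name] = str(exc)
    conn.close()
    return unsupported


unsupported = star_test(SCHEMA, WORKLOAD)
for name, error in unsupported.items():
    print(f"not supported: {name} ({error})")
```

This only shows that a query can be formulated against the schema. Whether the formulation returns what users actually meant still needs human review.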
A functional test of ETL is aimed at checking that ETL procedures correctly extract, clean, transform, and load data into the data mart. The best approach here is to set up unit tests and integration tests. Unit tests are white-box tests that each developer carries out on the units they have developed. Integration tests allow the correctness of data flows in ETL procedures to be checked.
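The distinction between the two kinds of test can be sketched with Python's unittest module. Here clean_record plays the role of a single ETL unit and run_pipeline the role of a small data flow. Both functions and their rules are hypothetical examples, not part of any particular ETL tool.

```python
import unittest


def clean_record(raw):
    """Hypothetical ETL unit: trim the name, normalise the country code, parse the amount."""
    return {
        "name": raw["name"].strip(),
        "country": raw["country"].strip().upper(),
        "amount": float(raw["amount"]),
    }


def run_pipeline(raw_rows):
    """Hypothetical flow: clean each row, then drop rows with a negative amount."""
    cleaned = [clean_record(row) for row in raw_rows]
    return [row for row in cleaned if row["amount"] >= 0]


class CleanRecordUnitTest(unittest.TestCase):
    """White-box test of one unit, written with knowledge of its rules."""

    def test_trims_and_normalises(self):
        out = clean_record({"name": " Ann ", "country": "us ", "amount": "12.50"})
        self.assertEqual(out, {"name": "Ann", "country": "US", "amount": 12.5})


class PipelineIntegrationTest(unittest.TestCase):
    """Checks the data flow across units rather than any single unit."""

    def test_flow_rejects_negative_amounts(self):
        rows = [
            {"name": "Ann", "country": "us", "amount": "10"},
            {"name": "Bob", "country": "de", "amount": "-1"},
        ]
        self.assertEqual([row["name"] for row in run_pipeline(rows)], ["Ann"])


loader = unittest.defaultTestLoader
suite = unittest.TestSuite([
    loader.loadTestsFromTestCase(CleanRecordUnitTest),
    loader.loadTestsFromTestCase(PipelineIntegrationTest),
])
result = unittest.TextTestRunner().run(suite)
```

In a real project the integration test would typically run the actual ETL procedures against a small, known data set and compare the loaded data mart with the expected content.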
Functional testing of the analysis front-ends must necessarily involve a very large number of end-users, who are generally so familiar with the application domain that they can detect even the slightest abnormality in the data. Nevertheless, wrong results in OLAP analyses may be difficult to recognize. They can be caused not only by faulty ETL procedures, but also by incorrect data aggregations or selections in front-end tools.